Statement of the problem studied

RNA viruses can rapidly mutate, causing therapeutics and vaccines to lose their
effectiveness.  The  long-term  goal  of  this  project  is  to  predict  such  mutations,  in  order  to 
anticipate  their  effect  and  design  better  therapeutics  and  vaccines.  In  the  funding  period 
reported  here,  the  specific  goal  was  to  build  a  predictive  model  of  viral  escape  from 
immune  pressure  exerted  by  monospecific  T  cells  in  vitro.  The  approach  chosen  was  to 
build  a  predictive  model  based  on  data  from  the  literature,  while  simultaneously 
generating  a  new  experimental  data  set  to  parameterize  and  test  the  predictive  models. 

Summary  of  the  most  important  results 

The  project  goals  were  specified  in  five  milestones,  building  on  each  other. 

1)  To  produce  and  characterize  T  cell  lines  specific  for  two  LCMV  epitopes 

2)  To  generate  a  panel  of  single  substitution  analogs  of  these  two  epitopes 

3)  To  perform  an  entropy  analysis  of  the  LCMV  NP  protein  by  directed  sequencing  of  a 
large  number  of  viral  isolates 

4)  To  construct  a  model  that  will  predict,  based  on  the  data  from  milestones  2  and  3, 
which  mutations  are  most  likely  to  appear  in  wild  type  LCMV  when  it  is  co-cultured 
with  the  T  cell  lines  generated  in  milestone  1 
5) To generate in vitro LCMV mutants by exerting immune pressure, and verify the
accuracy  of  the  prediction  generated  in  milestone  4 

The  following  summarizes  the  most  important  results  of  the  project  grouped  by  milestone. 
For  a  detailed  description  including  alternative  approaches  that  weren’t  successful,  please 
refer  to  the  quarterly  reports. 

Milestone  1)  To  produce  and  characterize  T  cell  lines  specific  for  two  LCMV 
epitopes 

The milestone was completed successfully. We established a reproducible source of T
cells. Of multiple experimental approaches tried, the following protocol gave
the best results: Mice were immunized with two known peptide epitopes (NP396 and
NP205) from LCMV. The mice were sacrificed 10 days after immunization, and CD8+ T cells
were purified from their splenocytes. The epitope specificity of the T cells was
characterized and confirmed by ICCS and ELISPOT assays. These showed that NP396
induced a strong response, while NP205 induced a low response. This is in
agreement with their previous characterization as a dominant and a subdominant epitope,
respectively (Figure 1).


[Figure 1 plot: IFN-gamma ELISPOT; x-axis: stimulating peptide (Media, NP205, Media, NP396)]


Figure  1:  Induction  of  epitope  specific  CD8+  T  cell  response  following  peptide 
immunization.  Responses  were  determined  in  a  direct  ex  vivo  ELISPOT  assay,  and 
are  measured  in  spot  forming  cells  (SFC)  per  million  CD8+  T  cells.  The  three  bars 
indicate  the  results  from  different  repeats. 

Milestone  2)  To  generate  a  panel  of  single  substitution  analogs  of  these  two 
epitopes. 

The milestone was completed successfully. We comprehensively characterized the
impact of single residue mutations in epitopes on their T cell recognition. We
synthesized a complete set of single residue substitution peptides, and tested each peptide
for the ability to stimulate the T cells generated in milestone 1 in ELISPOT assays (Figure
2). Based on the data, the peptides could be classified as strongly, intermediately or weakly
cross-reactive in terms of IFN-gamma production. The pattern clearly shows that
positions 4 - 8 of NP396 and positions 1 and 4 of NP205 have the strongest
influence on T cell recognition.


Figure  2  -  T  cell  cross-reactivity  of  peptides  with  single  residue  mutations 
from  the  original  epitopes.  The  top  row  of  each  table  gives  the  original  epitope 
sequence.  Each  cell  in  the  matrix  below  contains  data  from  a  single  residue 
substitution  peptide  specified  by  the  position  of  the  column  in  the  original  peptide 
and the residues listed in the leftmost column. The color assigned corresponds to
strongly (green), intermediately (gray) and weakly (red) cross-reactive peptides.


Going beyond the originally proposed milestone, we wanted to determine how many of the
peptides do not cross-react because they have lost the ability to bind MHC molecules. MHC
binding assays were run for all peptide analogs to separate the effect of differences
in binding affinity from that of differences in TCR recognition (Figure 3). The positions
affecting binding most (such as the C-termini) are not the same as those determining T cell
cross-reactivity in Figure 2. This supports the notions that a) residues buried in the MHC
binding groove make limited contacts with TCRs, leading the T cells to ignore substitutions
at those positions, and b) even low affinity binders are able to stimulate a pre-existing
T cell response in an IFN-gamma ELISPOT assay. Whether the latter holds in vivo would
have to be analyzed further, but is doubtful.

Figure 3 - MHC binding affinity of peptides with single residue mutations from
the  original  epitopes.  The  top  row  of  each  table  gives  the  original  epitope 
sequence.  Each  cell  in  the  matrix  below  contains  data  from  a  single  residue 
substitution  peptide  specified  by  the  position  of  the  column  in  the  original  peptide 
and  the  residues  listed  in  the  leftmost  column.  The  number  in  each  cell  is  the 
competitive binding affinity measured as IC50 [nM]. Weakly binding peptides with
IC50 greater than 500 nM are colored in red.


Milestone  3)  To  perform  an  entropy  analysis  of  the  LCMV  NP  protein  by  directed 
sequencing  of  a  large  number  of  viral  isolates 

This milestone was completed successfully, but much later in the project than
originally planned. We were kindly supplied with viral stocks by Jason Botten / Mike
Buchmeier at Scripps. These are stocks from a long term culture of LCMV with 60+
passages in MC57 cells, many more passages than we could have performed
ourselves during this project. We retrieved the initial stock (P0), day 45 (P10) and day 178
(PMax) samples and isolated viral RNA. Full length genomes were sequenced using standard
methods at each time point, which identified the mutations listed in Table 1.


Protein  Codon  AA change  P10/PMax  p (Illumina)
NP       432    L->F       X X       <0.001%
NP       444    G->D       X         0.11
L        1365   (silent)   X         0.13
L        1711   E->K       X X       <0.001%
L        1713   L->S       X         0.00005

Table  1  -  Mutations  identified  in  the  LCMV  consensus  sequence  by  standard 
sequencing  methods.  The  last  column  identifies  the  p-value  associated  with  these 
mutations  in  the  Illumina  method  (Table  2  below). 

The number of mutations identified in the consensus sequence even after 63
passages  was  low,  raising  concerns  that  the  sequence  resolution  we  were  hoping  to 
achieve  by  sequencing  50+  individual  clones  would  be  insufficient.  We  instead  decided  to 
apply  novel  sequencing  techniques  that  give  direct  data  on  the  sequence  population  in  a 
sample  without  having  to  isolate  clones.  This  can  potentially  identify  mutations  present  at  a 
much  lower  frequency. 


We decided to utilize the sequencing services at Illumina, which required less input
material than the competing 454 technology and promised a quicker turnaround
time. Our project was the first to apply this technology to viral RNA. Several repeats of the
sample preparation and sequencing were necessary before we received results. Those
results showed the presence of a Mycoplasma co-infection in the viral culture. As the
sequencing technology directly reflects the genomic content of the sample, this reduced
the amount of LCMV specific sequencing data.

We developed a data cleaning algorithm that selects sequence reads that a) meet a high
Illumina quality score cutoff, b) have a significant alignment against the LCMV genome,
c) have no better alignment against the Mycoplasma genome and d) fall within the protein
coding region of LCMV. A total of 1,473k (about 1.47 million) nucleotide calls meet these
criteria, of which 98.7% are consensus calls and 1.3% are mutation calls. If the mutations
were randomly distributed sequencing errors, the number of mutation calls at each
nucleotide position should follow a Poisson distribution. We picked all positions that have
a significantly higher number of mutation calls than expected under the Poisson
distribution, using a p-value cutoff of p < 10^-5. This identified 29 nucleotide positions
with a high number of mutations, compared to the 0.3 positions that would have been
expected by chance (Table 2).
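
The position-wise Poisson test described above can be sketched as follows. This is a minimal illustration, not the code used in the project. It assumes that the expected number of mutation calls at a position is the read coverage at that position times the overall mutation call rate of 1.3%. The function names, the per-position input lists and the coverage values in the usage note are hypothetical.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with lam > 0."""
    if k <= 0:
        return 1.0
    # Sum P(X = i) for i < k in log space to avoid underflow for large lam.
    cdf = sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
              for i in range(k))
    return max(0.0, 1.0 - cdf)

def significant_positions(mutation_calls, coverage, error_rate=0.013, cutoff=1e-5):
    """Flag positions with more mutation calls than sequencing error explains.

    mutation_calls[i] and coverage[i] are the number of non-consensus calls
    and the total number of calls at position i (hypothetical inputs).
    """
    hits = []
    for pos, (m, n) in enumerate(zip(mutation_calls, coverage)):
        if n <= 0:
            continue  # no data at this position, nothing to test
        lam = n * error_rate  # expected mutation calls under pure error (assumption)
        p = poisson_upper_tail(m, lam)
        if p < cutoff:
            hits.append((pos, p))
    return hits
```

For example, `significant_positions([2, 1, 40, 0], [140, 140, 140, 140])` flags only the third position (index 2). The number of positions expected to pass by chance is roughly the number of tests performed times the cutoff.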


Protein  Codon  Mutation     P10/Pmax
GP       33     (silent)     X
GP       77     V -> A       X X
GP       203    Y -> H       X
GP       208    W -> R       X
GP       424    D -> G       X
GP       431    S -> N       X
GP       432    T -> A       X
GP       432    (silent)     X
NP       160    W -> C       X
NP       162    V -> G       X
NP       225    L -> P       X
NP       300    E -> K       X
NP       301    N -> D       X
NP       325    (silent)     X
NP       325    R -> G       X
NP       432    L -> F       X
NP       498    L -> V       X X
NP       514    [illegible]  X X
NP       515    T -> P       X
L        5      (silent)     X
L        151    F -> L       X
L        959    (silent)     X
L        1002   (silent)     X
L        1107   (silent)     X
L        1711   E -> K       X
L        2056   (silent)     X

Table  2  -  Mutations  identified  in  the 
LCMV  viral  population  by  the  Illumina 
sequencing  method 

The sequencing data generated for this milestone corresponds to about 140 full
length genomic sequences of LCMV, much more than originally planned. However, due
to the change in experimental approach, initial difficulties in the sample preparation and
the Mycoplasma co-infection, this data only became available at the end of the funding
period, and could not be used as planned in milestones 4 and 5.


Milestone 4: To construct a model that will predict, based on the data from milestones 2
and 3, which mutations are most likely to appear in wild type LCMV when it is
co-cultured with the T cell lines generated in milestone 1.


\dot{n}_i(t) = r(t) \sum_j M_{ij} A_j n_j(t) - \lambda n_i(t)

\dot{r}(t) = \eta - r(t) \Big( \beta + \sum_{i,j} M_{ij} A_j n_j(t) \Big)

r = resource; \eta, \beta = replenishing rate, depletion rate
n_i = number of viruses i
A_i = fitness of virus i = amino acid fitness
M_{ij} = mutation j -> i = nucleotide exchange
\lambda = death rate

Figure  4  -  Central  rate  equation 


This milestone was completed successfully. A mathematical model of viral evolution was
developed by combining the theoretical approaches of the quasispecies and sequence
evolution models. The central set of equations is shown in Figure 4.


The primary novel elements of this approach are a) treating evolution at each site
independently, b) explicitly modeling evolution at the nucleotide, codon and amino acid
level, c) distinguishing between random mutations and fitness mediated selection, and
d) limiting the viral particle counts in steady state to finite numbers by assuming all
particles compete for a common resource (such as space). Figure 5 shows a simulation of
the total viral particle number dynamics compared to the measured time course.
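
As an illustration of how rate equations of this type can be integrated, the sketch below uses a forward-Euler scheme. It is not the project's implementation. It assumes the following reading of the equations in Figure 4: replication of each variant is scaled by the shared resource r, every variant dies at rate lam, and the resource is replenished at rate eta and depleted at rate beta plus the total replication term. The function name, all parameter values and the two-variant example are hypothetical.

```python
def integrate_rate_equations(A, M, eta=1.0, beta=0.1, lam=0.5,
                             n0=None, r0=1.0, dt=0.01, steps=20000):
    """Forward-Euler integration of (assumed reading of Figure 4):

        dn_i/dt = r * sum_j M[i][j] * A[j] * n[j] - lam * n[i]
        dr/dt   = eta - r * (beta + sum_ij M[i][j] * A[j] * n[j])

    A[j]    fitness of variant j
    M[i][j] probability that a copy of variant j comes out as variant i
    """
    k = len(A)
    n = list(n0) if n0 is not None else [1.0] + [0.0] * (k - 1)
    r = r0
    history = []
    for _ in range(steps):
        # Replication term per variant, before scaling by the resource.
        growth = [sum(M[i][j] * A[j] * n[j] for j in range(k)) for i in range(k)]
        total_growth = sum(growth)
        n = [max(0.0, n[i] + dt * (r * growth[i] - lam * n[i])) for i in range(k)]
        r = max(0.0, r + dt * (eta - r * (beta + total_growth)))
        history.append((sum(n), r))  # total particle number and resource level
    return n, r, history
```

With `A=[1.0, 2.0]` and `M=[[0.99, 0.01], [0.01, 0.99]]`, the fitter second variant ends up dominating, and the total particle number stays finite because both variants draw on the same resource.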


Figure  5  -  Dynamics  of  modeled  and  measured  total  viral  particle  numbers. 

The  modeled  (left  panel)  and  measured  (right  panel)  viral  particle  numbers  (y-axis) 
are  plotted  as  a  function  of  time  t  (x-axis).  The  modeled  time  courses  display 
relaxation  oscillations  similar  to  those  observed  in  the  viral  titers  of  the  long  term 
LCMV  culturing  data. 

The rate equations given above describe the average dynamics of an infinite number of viral
populations. For an individual process, the dynamics have to include stochastic
terms, which take into account variations in replication, mutation and destruction events.
This was done by introducing Langevin terms into the time-discretized version of the
equations. Two separate time courses of simulation runs are shown here:




Figure  5  -  Dynamics  of  modeled  particle  numbers  for  different  members  of  a 
viral  population.  The  viral  particle  numbers  (y-axis)  are  plotted  as  a  function  of 
time  t  (x-axis).  Each  colored  line  represents  the  viral  population  associated  with  a 
single nucleotide mutation to the wild type (blue). At time 4000, the selective
pressure of T cells monospecific for the wild type epitope is added into the model.
This  gives  mutants  that  are  not  recognized  by  T  cells  an  advantage  over  the  wild 
type  sequence. 

Repeating these simulations 200 times gives a distribution of the escape mutants that
become dominant. Table 3 lists the escape mutations observed most frequently in these 200
repeats. These are the model predictions, utilized in milestone 5, of which escape
mutations arise as a result of added selective pressure.
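
The repeated stochastic runs can be sketched as below. This is a hedged illustration rather than the model used in the project. It assumes a wild type (variant 0) connected to each single mutant by the same mutation probability mu. T cell pressure is modeled as an extra death rate `kill[i]` for variants that are still recognized. The Langevin term is Gaussian noise with variance dt times the sum of the birth and death rates, and the resource equation is kept deterministic. All names and parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def star_mutation_matrix(k, mu):
    """M[i][j] = probability that a copy of variant j comes out as variant i."""
    M = [[0.0] * k for _ in range(k)]
    M[0][0] = 1.0 - (k - 1) * mu
    for i in range(1, k):
        M[i][0] = mu        # wild type -> mutant i
        M[0][i] = mu        # mutant i -> wild type
        M[i][i] = 1.0 - mu
    return M

def run_once(rng, A, kill, mu=1e-3, eta=1000.0, beta=0.1, lam=0.5,
             dt=0.05, steps=2000, t_switch=40.0):
    """One stochastic run; returns the index of the variant dominant at the end."""
    k = len(A)
    M = star_mutation_matrix(k, mu)
    n = [eta / lam] + [0.0] * (k - 1)   # start as a pure wild type population
    r = lam / A[0]
    for step in range(steps):
        pressure = step * dt >= t_switch   # T cells are added at t_switch
        birth = [r * sum(M[i][j] * A[j] * n[j] for j in range(k)) for i in range(k)]
        death = [(lam + (kill[i] if pressure else 0.0)) * n[i] for i in range(k)]
        consumption = sum(birth)           # resource used up by replication
        new_n = []
        for i in range(k):
            # Langevin term: Gaussian noise scaled by the event rates (assumption).
            noise = math.sqrt(dt * (birth[i] + death[i])) * rng.gauss(0.0, 1.0)
            new_n.append(max(0.0, n[i] + dt * (birth[i] - death[i]) + noise))
        r = max(0.0, r + dt * (eta - beta * r - consumption))
        n = new_n
    return max(range(k), key=lambda i: n[i])

def tally_dominant(repeats, seed, A, kill, **kwargs):
    """Repeat the simulation and count which variant ends up dominant."""
    rng = random.Random(seed)
    return Counter(run_once(rng, A, kill, **kwargs) for _ in range(repeats))
```

For example, `tally_dominant(200, seed=1, A=[0.0020, 0.0019, 0.0019, 0.0019], kill=[0.5, 0.5, 0.0, 0.0])` counts how often each of the two unrecognized mutants (variants 2 and 3) becomes dominant. The wild type and the still recognized mutant (variant 1) are suppressed once the pressure is switched on.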

Table 3 - Predicted viral escape mutations arising due to immune pressure.

The top row gives the wild type epitope sequences. Letters in bold underneath
occurred in 10 or more of the 200 runs (at least 5%). Letters at the bottom occurred in
between 2 and 9 runs (at least 1%).

Milestone  5:  To  generate  in  vitro  LCMV  mutants  by  exerting  immune  pressure, 
and  verify  the  accuracy  of  the  prediction  generated  in  milestone  4. 

This  milestone  was  not  completed  as  planned,  as  the  problems  with  obtaining  sequences 
from  viral  cultures  were  only  solved  at  the  end  of  the  project  period.  As  we  had  resources 
available,  we  further  optimized  the  generation  of  monospecific  T  cells  for  the  purpose  of 
this  milestone.  We  had  purchased  MHC  tetramers  with  the  NP396  epitope,  which  allowed 
us  to  sort  out  epitope  specific  T  cells.  We  compared  several  experimental  approaches  to 
generate  epitope  specific  T  cells  through  an  infection  first,  followed  by  in  vitro 
restimulation.  Optimal  results  were  obtained  by  infecting  mice  with  recombinant  vaccinia 
viruses  expressing  the  LCMV  NP  protein,  and  then  expanding  the  generated  T  cells  through 
in vitro culture. Figure 6 shows the much higher yield of that approach compared to
taking cells from LCMV infected animals.

Figure 6 - Optimized protocol to generate monospecific T cells. The panels
show  FACS  tetramer  stainings  of  epitope  specific  IFN-gamma  producing  T  cells  from  LCMV 
infected  (left)  and  recombinant  vaccinia  virus  infected  (right)  animals. 

As an alternative approach to test the predictive capacity of the model from milestone 4, we
utilized escape mutation data from previously published studies by Oldstone [1, 2]. In those
experiments, the authors cultured viral populations with T cells monospecific for the
NP396 epitope, and isolated viral clones that escaped immune detection. The mutation
identified in the experimental study is NP403 F -> L. In our model (Table 3), this is
also one of the frequently occurring escape mutations. This agreement provides
evidence that the mutations predicted by our model are indeed those occurring more
frequently in the experiment.